Technical report on Automatic Identification of Paedophile

Authors

  • Christian Belbèze
  • David Chavalarias
  • Ludovic Denoyer
  • Raphaël Fournier
  • Jean-Loup Guillaume
  • Matthieu Latapy
  • Clémence Magnien
  • Guillaume Valadon
  • Vasja Vehovar
  • Aleš Žiberna
Abstract

Accurate and up-to-date knowledge of the keywords entered by users who search for or provide paedophile content is a key resource for filtering and for monitoring by law enforcement institutions. However, such keywords are often hidden and may change frequently, and current knowledge of them relies on manual inspection and field expertise. We explore whether various keyword analysis methods can help improve this situation. Using a large-scale real-world collection of paedophile and non-paedophile file names, we construct lists of keywords suspected of being used as paedophile keywords. We evaluate the relevance and interest of these lists by submitting them to experts, thus showing that automatic approaches are of great interest for this task.

Similar resources

Technical report on the Automatic Detection of Paedophile Queries

Filtering or identifying paedophile queries is a key issue for law enforcement and search engines. However, these queries are generally mixed with a huge number of other queries. Moreover, little is known about their characteristics. We address these two issues here in order to design the first tool for automatic detection of paedophile queries. Using domain expertise, we select some paedophile q...

Technical report on Maps of paedophile activity

As policy-making and law enforcement institutions generally operate at the national level, or at least at a regional level (Europe for instance), we studied geolocated recordings available in a large dataset obtained by a measurement of keyword-based queries submitted to a large P2P server. We observed that the fractions of paedophile queries may be orders of magnitude larger in some countries ...

Dynamics of Paedophile Keywords in eDonkey Queries

This technical report synthesizes the results of the analysis of paedophile keywords’ dynamics in two sets of eDonkey queries, collected during several months in 2007 and 2009 respectively. The goal of this work is to study the evolution of paedophile keywords’ frequency and popularity over several weeks (i.e. within a given dataset), as well as between the two different datasets. Moreover, spe...

Measurement and Analysis of P2P Activity Against Paedophile Content

Peer-to-peer (P2P) systems are nowadays widely used to exchange files, and it is acknowledged that they host much paedophile activity. However, current knowledge of this specific activity remains very limited, and almost no tools exist for user protection. Likewise, tools and knowledge for policy making and law enforcement are far from sufficient. The goal of the Measurement and Analysis of P2P ...

First Report on Paedophile Keywords Observed in eDonkey

This report presents our first analysis results on paedophile keywords observed in exchanges between eDonkey clients and their server. We first describe our dataset and the messages studied in this context. General statistics on the number of queries, filenames, clients and keywords are provided, before focusing on paedophile keywords appearing in user queries and/or in filenames. Statistical a...

Publication date: 2010